Fundamental frequency as basis for speech segmentation modeling

Authors

  • Francisco Lacerda
  • Ulla Sundberg
  • Ellen Marklund
Abstract

The present study investigates the relevance of fundamental frequency in speech segmentation models intended to simulate infants. Speech from three different conditions (infant-directed speech to 3- and 12-month-olds, and adult-directed speech) was segmented based on fundamental frequency information, using a variant of the dpn-gram segmenting technique (highlighting similar segments as lexical candidates). The spectral distance between segments that were found based on fundamental frequency similarity was calculated and compared to the spectral distance between segments that were found using transcription as the basis for segmentation, as well as to the spectral distance between randomly paired segments from the same speech materials. The results show the greatest within-condition difference in speech directed to 3-month-olds, in which segmenting based on fundamental frequency similarity generated segment pairs with smaller spectral distance than did transcription-based segmentation or random segment pairs. Speech directed to 12-month-olds resulted in a somewhat smaller difference when using fundamental frequency data compared to when using transcriptions. For adult-directed speech, no difference was found in spectral distance between pairs generated by the different bases for segmentation. Neither segmenting speech by highlighting similar segments as lexical candidates nor using fundamental frequency as the basis for segmentation is optimal for a speech segmentation model intended to simulate 12-month-olds or adults. These groups are more likely to segment speech based on their already present or growing linguistic experience than on acoustic similarity alone. However, for a model simulating a 3-month-old infant, the present segmentation procedure and its basis for segmentation are more plausible.
When modeling speech segmentation in an infant-like manner it is important to take into account both that the cognitive abilities of infants develop rapidly during the first year of life, and that some aspects of their linguistic environment vary during this period.
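
The comparison described in the abstract (spectral distance within pairs selected by fundamental frequency similarity versus within randomly paired segments) can be illustrated with a minimal sketch. The abstract does not say which spectral distance measure was used, so Euclidean distance between fixed-length spectral vectors is an assumption here. All function names and the toy data are hypothetical, and this is not the authors' implementation.

```python
import math
import random


def spectral_distance(a, b):
    """Euclidean distance between two equal-length spectral vectors (assumed measure)."""
    if len(a) != len(b):
        raise ValueError("spectral vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean_pair_distance(pairs):
    """Average spectral distance over a list of (vector, vector) pairs."""
    return sum(spectral_distance(a, b) for a, b in pairs) / len(pairs)


def random_pairs(segments, n_pairs, seed=0):
    """Draw n_pairs pairs of distinct segments at random (the baseline condition)."""
    rng = random.Random(seed)
    return [tuple(rng.sample(segments, 2)) for _ in range(n_pairs)]


# Toy data: each segment is a hypothetical 3-band average spectrum.
segments = [
    [1.0, 2.0, 3.0],
    [1.1, 2.1, 2.9],
    [5.0, 0.5, 1.0],
    [4.8, 0.6, 1.2],
]

# Hypothetical pairs, assumed to have been selected by F0 similarity beforehand.
f0_pairs = [(segments[0], segments[1]), (segments[2], segments[3])]

f0_mean = mean_pair_distance(f0_pairs)
baseline_mean = mean_pair_distance(random_pairs(segments, 20))
print(f"F0-based pairs: {f0_mean:.3f}  random pairs: {baseline_mean:.3f}")
```

A lower mean distance for the F0-based pairs than for the random baseline would correspond to the pattern the abstract reports for speech directed to 3-month-olds. With real recordings, the vectors would come from an actual spectral analysis rather than toy values.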


Similar resources

Similarities in fundamental frequency in infant speech segmentation models

The present study investigates fundamental frequency as a potential basis for segmentation in models of infant speech segmentation. Pairs of segments that were similar either in terms of fundamental frequency envelope or in terms of transcribed content were found in three different speech styles: speech directed to three-month-olds, speech directed to twelve-month-olds and speech directed to adul...


A Snack Implementation and Tcl/Tk Interface to the Fundamental Frequency Variation Spectrum Algorithm

Intonation is an important aspect of vocal production, used for a variety of communicative needs. Its modeling is therefore crucial in many speech understanding systems, particularly those requiring inference of speaker intent in real-time. However, the estimation of pitch, traditionally the first step in intonation modeling, is computationally inconvenient in such scenarios. This is because it...


A study of some acoustic features of infant-directed speech in Persian-speaking mothers

Introduction: When adults talk to another person, they also take the listener's linguistic characteristics into account. A clear example of speech changing with the listener is maternal, or infant-directed, speech. Infant-directed speech is slower, with longer sentences and pauses at the end of the utterance. Undoubtedly the most distinctive feature of this style of speech is acoustic c...


First and second language similarity can hurt the learning of second-language speech segmentation: The case of prosody

This study investigates whether learning to use prosodic cues to word boundaries in second-language speech segmentation is easier or more difficult if the native and second languages have similar (though non-identical) prosodies than if they have markedly different prosodies. It compares French, Korean, and English listeners’ use of fundamental-frequency rise and lengthening as cues to word-fin...


Continuous Speech Recognition of Japanese Using Prosodic Word Boundaries Detected by Mora Transition Modeling of Fundamental Frequency Contours

An HMM-based method of detecting prosodic word boundaries was developed for Japanese continuous speech and was successfully integrated into a mora-basis continuous speech recognition system with two stages operating without and with prosodic information. The method is based on modeling the fundamental frequency (F0) contour of input speech as transitions of mora-unit F0 contours and operates af...




Publication date: 2011